Abstract
Background: Although gene editing by CRISPR/Cas9 is a promising curative strategy for inherited and acquired diseases, potential off-targets remain a major concern. Most computational and biochemical methods for off-target search have focused on single genomes, not considering the impact of genetic diversity among human populations. Previous studies showed that genetic variants might significantly impact on- and off-target specificity. However, none of them explored the most extensive worldwide human genetic diversity dataset to examine the impact of population genetic variation on therapeutically relevant guide RNAs (gRNAs).
Objective: To evaluate how genetic variation among human populations affects the predicted off-targets for gRNAs that have been used in clinical trials or had consistent results in preclinical studies.
Methods: We selected 13 gRNAs tested for sickle cell disease (SCD), severe combined immunodeficiency, chronic granulomatous disease, X-linked hyper IgM syndrome, acquired immunodeficiency syndrome, and Leber congenital amaurosis. Off-target prediction was performed against the reference genome (hg38) and against population-specific genomes using CRISPRitz (https://github.com/pinellolab/CRISPRitz). Genomic variants with allele frequencies ≥1% were obtained from the entire gnomAD v3.0 dataset (76,156 whole-genome sequences) and from each ethnic group separately. Off-target sequences were filtered to allow up to four mismatches and bulges, and those harbouring only mismatch events were additionally required to have a cutting frequency determination (CFD) score ≥0.2. Sequences were annotated with ANNOVAR (https://annovar.openbioinformatics.org/).
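The filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study and does not reproduce the CRISPRitz output format: the `OffTarget` record, its field names, and the toy hits are hypothetical. It also assumes that the limit of four applies to mismatches and to bulges separately, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional

MAX_MISMATCHES = 4  # threshold from the Methods text
MAX_BULGES = 4      # assumption: the same limit is applied to bulges separately
MIN_CFD = 0.2       # CFD cutoff for mismatch-only hits


@dataclass(frozen=True)
class OffTarget:
    """Hypothetical record for one predicted off-target site."""
    chrom: str
    pos: int
    mismatches: int
    bulges: int
    cfd: Optional[float]  # None when no CFD score is available


def passes_filter(hit: OffTarget) -> bool:
    """Apply the mismatch/bulge limits and the CFD cutoff for mismatch-only hits."""
    if hit.mismatches > MAX_MISMATCHES or hit.bulges > MAX_BULGES:
        return False
    if hit.bulges == 0:
        # Mismatch-only event: a CFD score of at least 0.2 is required.
        return hit.cfd is not None and hit.cfd >= MIN_CFD
    # Events with bulges are kept without a CFD requirement.
    return True


# Toy hits, invented for illustration only.
hits = [
    OffTarget("chr1", 1000, mismatches=3, bulges=0, cfd=0.45),  # kept
    OffTarget("chr2", 2000, mismatches=4, bulges=0, cfd=0.05),  # CFD too low
    OffTarget("chr3", 3000, mismatches=2, bulges=1, cfd=None),  # kept (has a bulge)
    OffTarget("chr4", 4000, mismatches=5, bulges=0, cfd=0.90),  # too many mismatches
]
kept = [h for h in hits if passes_filter(h)]
```

Keeping the thresholds as named constants makes it straightforward to test how sensitive the off-target counts are to the chosen cutoffs.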
Results: A total of 1,727 off-target events were found against the reference genome for the 13 selected gRNAs. An additional 134 events were detected with the gnomad_all dataset. When variants from the different ethnic groups were considered, 234 additional off-targets were observed. The distribution of CFD values was comparable between the population-specific/additional and the reference genome off-targets, as was the distribution of their genomic localization. Most events mapped to intergenic regions (~50%), followed by intronic regions (~35%). The proportion of additional events varied among the gRNAs (ranging from a 7.8% to a 27.7% increase), and the number of events also varied among the populations. Considering the different ethnic groups, the highest number of events was found in African-ancestry populations (164) and the lowest in the non-Finnish European population (87), implying that some off-target events were found exclusively in one of the populations (91 in total). Although 219 (75%) of the variants within the population-specific/additional off-targets have a frequency between 1% and 10%, 20 (6.8%) have a frequency higher than 80% in at least one of the populations analysed. Considering only the gRNAs used in gene therapy for SCD (n=6), 100 additional predicted off-targets were found in the African population, of which 29 are exclusive, whereas only 77 were found in gnomad_all and 43 in the non-Finnish European population, of which 2 are exclusive.
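The distinction between additional and population-exclusive events can be expressed with set operations, as in the hedged Python sketch below. It assumes that an off-target site can be identified by a hashable key such as (chromosome, position). The function names, population labels, and toy coordinates are hypothetical and are not the study's data.

```python
from collections import Counter


def additional_events(population_hits, reference_hits):
    """Sites predicted with population variants but absent from the reference set."""
    return set(population_hits) - set(reference_hits)


def exclusive_events(additional_by_population):
    """For each population, keep the additional sites seen in no other population."""
    site_counts = Counter(
        site
        for sites in additional_by_population.values()
        for site in set(sites)
    )
    return {
        population: {site for site in sites if site_counts[site] == 1}
        for population, sites in additional_by_population.items()
    }


# Toy coordinates, invented for illustration only.
reference = {("chr1", 100), ("chr2", 200)}
predicted = {
    "african": reference | {("chr3", 300), ("chr4", 400)},
    "non_finnish_european": reference | {("chr3", 300), ("chr5", 500)},
}

additional = {
    population: additional_events(hits, reference)
    for population, hits in predicted.items()
}
exclusive = exclusive_events(additional)
```

In this toy case each population has two additional sites, one shared and one exclusive, which mirrors how an additional event can be either common to several groups or restricted to one.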
Discussion: Rigorous analysis of off-target events is a necessary step in addressing the safety of gene therapy strategies. Although in silico analysis against the reference genome captures most predicted off-target events, we and others have demonstrated that an important proportion of additional events is missed if population-specific variants are not considered. Herein, we showed that the number of additional events is particularly high when variants with allele frequencies ≥1% in African populations are considered. This might occur because the African population is more genetically diverse and the reference genome was assembled based on Caucasian individuals. In addition, several off-target events are found exclusively in non-African populations, showing the importance of considering the genetic background of patients who undergo gene therapy. Additional caution should be taken for diseases that are more frequent in specific populations, such as SCD and β-thalassemia, for which there are several ongoing gene therapy trials.
No relevant conflicts of interest to declare.